Frequency warping by linear transformation of standard MFCC
Abstract
A novel linear transform (LT) is proposed for frequency warping (FW) with standard filterbank-based MFCC features. We use the spectral interpolation idea of [9] to perform continuous warping in the log filterbank output domain, and incorporate both interpolation and warping into a single warped IDCT matrix. The new transformation matrix is thus mathematically simpler than that in [9] and, unlike the previous approach, requires no modification of standard MFCC feature extraction. In vocal tract length normalization (VTLN) experiments with maximum likelihood score (MLS) estimation of the FW parameter, the new LT outperformed regular VTLN implemented by warping the Mel filterbank. In speaker adaptation experiments using the new LT to transform HMM means, the results were significantly better than MLLR for limited adaptation data and comparable to those in [8], while using the computationally simpler MLS FW estimation.
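The idea of folding interpolation and warping into one warped IDCT matrix can be illustrated with a minimal sketch. It is not the paper's exact formulation. It assumes an orthonormal DCT-II for the cepstra, a piecewise-linear warping function applied on the normalized filterbank axis, and hypothetical names (`dct_matrix`, `warped_idct_matrix`, `piecewise_linear_warp`, `fw_transform`) and parameter values (`u0=0.8`, 13 cepstra, 26 filters) chosen only for illustration.

```python
import numpy as np

def dct_matrix(num_ceps, num_filters):
    """Orthonormal DCT-II matrix C, shape (num_ceps, num_filters)."""
    k = np.arange(num_ceps)[:, None]
    m = np.arange(num_filters)[None, :]
    C = np.sqrt(2.0 / num_filters) * np.cos(
        np.pi * k * (2 * m + 1) / (2 * num_filters))
    C[0, :] /= np.sqrt(2.0)
    return C

def piecewise_linear_warp(alpha, u0=0.8):
    """Assumed warping function on the normalized axis u in [0, 1].

    Scales by alpha below the breakpoint u0, then joins linearly to (1, 1).
    """
    def warp(u):
        u = np.asarray(u, dtype=float)
        upper = alpha * u0 + (1.0 - alpha * u0) * (u - u0) / (1.0 - u0)
        return np.where(u <= u0, alpha * u, upper)
    return warp

def warped_idct_matrix(num_ceps, num_filters, warp):
    """IDCT basis evaluated at warped positions, shape (num_filters, num_ceps).

    The cosine basis interpolates the log filterbank outputs continuously,
    so sampling it at warp(u_m) interpolates and warps in one step.
    """
    u = (2 * np.arange(num_filters) + 1) / (2.0 * num_filters)
    u_warped = warp(u)
    k = np.arange(num_ceps)[None, :]
    B = np.sqrt(2.0 / num_filters) * np.cos(np.pi * k * u_warped[:, None])
    B[:, 0] /= np.sqrt(2.0)
    return B

def fw_transform(num_ceps, num_filters, alpha):
    """Cepstral-domain transform T = C @ (warped IDCT)."""
    C = dct_matrix(num_ceps, num_filters)
    B = warped_idct_matrix(num_ceps, num_filters, piecewise_linear_warp(alpha))
    return C @ B

# Example: build the transform for one warp factor and apply it to a
# placeholder cepstral vector (random values, for illustration only).
T = fw_transform(num_ceps=13, num_filters=26, alpha=0.9)
c = np.random.default_rng(0).standard_normal(13)
c_warped = T @ c
```

With `alpha = 1.0` the warp is the identity and `T` reduces to the identity matrix, which is a quick sanity check. Because `T` acts directly on cepstra, the same matrix could in principle be applied to HMM mean vectors, and features need not be recomputed for each candidate warp factor.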
Similar resources
Frequency warping for VTLN and speaker adaptation by linear transformation of standard MFCC
Vocal Tract Length Normalization (VTLN) for standard filterbank-based Mel Frequency Cepstral Coefficient (MFCC) features is usually implemented by warping the center frequencies of the Mel filterbank, and the warping factor is estimated using the maximum likelihood score (MLS) criterion (Lee and Rose, 1998). A linear transform (LT) equivalent for frequency warping (FW) would enable more efficie...
Vocal tract normalization as linear transformation of MFCC
We have shown previously that vocal tract normalization (VTN) results in a linear transformation in the cepstral domain. In this paper we show that Mel-frequency warping can equally well be integrated into the framework of VTN as linear transformation on the cepstrum. We show examples of transformation matrices to obtain VTN warped Mel-frequency cepstral coefficients (VTN-MFCC) as linear transf...
Implementing frequency-warping and VTLN through linear transformation of conventional MFCC
In this paper, we show that frequency-warping (including VTLN) can be implemented through linear transformation of conventional MFCC. Unlike the Pitz-Ney [1] continuous domain approach, we directly determine the relation between frequency-warping and the linear-transformation in the discrete-domain. The advantage of such an approach is that it can be applied to any frequency-warping and is not ...
Linear transformation approach to VTLN using dynamic frequency warping
In this paper, we present a novel linear transformation approach to frequency warping during vocal tract length normalisation (VTLN) using the idea of dynamic frequency warping (DFW). Linear transformation among the mel-frequency cepstral coefficients (MFCC) provides the computational advantage of not having to recompute features for each warp factor in VTLN. The proposed method uses the idea of separ...
Investigations on linear transformations for speaker adaptation and normalization
This thesis deals with linear transformations at various stages of the automatic speech recognition process. In current state-of-the-art speech recognition systems, linear transformations are widely used to compensate for a potential mismatch between the training and testing data and thus enhance recognition performance. A large number of approaches have been proposed in the literature, though the connectio...